Method and apparatus for generating audio components

ABSTRACT

The method and apparatus for generating a naturally sounding output audio signal (120) by adding missing output components (125) in a predetermined first frequency range (R1) to an input signal (100) set a first output energy measure (S1), over a predetermined first time interval (dt1), of the generated output components (125), based upon a first input energy measure (E1) calculated over a predetermined second time interval (dt2) of second input components (104) in a predetermined third frequency range (R3) of the input audio signal (100).

The invention relates to a method of generating an output audio signal by adding output components in a predetermined first frequency range to an input signal, the output components being generated by performing a predetermined calculation.

The invention also relates to an apparatus for generating output components in a predetermined first frequency range of an output audio signal, comprising calculation means for calculating the output components.

The invention also relates to an audio player, comprising audio data input means for providing an input audio signal, and audio signal output means for outputting a final output audio signal, and containing the apparatus.

The invention also relates to a computer program for execution by a processor, describing the method.

The invention also relates to a data carrier storing a computer program for execution by a processor, the computer program describing the method.

An embodiment of the method described in the opening paragraph is known from U.S. Pat. No. 6,111,960. The known method generates high frequency output components by applying e.g. a squaring function to first components in the input signal. E.g., if output components are desired in a first frequency range between 10 and 12 kHz, they can be generated by the squaring function, which doubles the frequency of first components in a predetermined second frequency range between 5 and 6 kHz. This is useful e.g. when the input audio signal is obtained by decompressing compressed audio such as MP3 audio, in which no high frequency information is present. The lack of high frequency components makes the audio sound unnatural. The squaring function is a technically simple way to generate high frequency audio components.

It is a disadvantage of the known method that the output audio signal still sounds unnatural, since the energy of the output components is directly determined by the energy of the squared first input components, and hence is not what is to be expected for high frequency components in a natural sound.

It is a first object of the invention to provide a method of the kind described in the opening paragraph, which yields an output audio signal which sounds relatively natural. It is a second object to provide an apparatus of the kind described in the opening paragraph, which is able to perform the method and to yield an output audio signal which sounds relatively natural.

The first object is realized in that a first output energy measure, over a predetermined first time interval, of the output components generated is set, based upon a first input energy measure calculated over a predetermined second time interval of second input components, in a predetermined third frequency range of the input audio signal. The invention is amongst others based on the insight that the energy of high frequency components in a natural audio signal, and more specifically the fluctuation pattern of energy in time, is different from the energy of low frequency components. The energy of low frequency components changes slowly, whereas the energy of high frequency components changes rapidly. This is due to factors such as e.g. the period of the component, and different reflection and scattering characteristics of the environment for different components.

If a component of low frequency is squared, the amplitude of the resulting double frequency component is uniquely determined by the amplitude of the low frequency component. Similarly the energy of output components is determined by the energy of the first input components. This results in an energy fluctuation pattern for high frequency components which has the characteristics of a fluctuation pattern of low frequency components.

The method of the invention sets the energy of the output components, over a first predetermined time interval, to a more realistic value. This interval is preferably chosen small enough to be able to set rapidly fluctuating energy patterns as they typically occur in the frequency range of the output components. The setting is best done by analyzing the energy fluctuation pattern of the input signal, e.g. of second input components, in a predetermined third frequency range. Fixed scaling of output components is known from the prior art, but not modulating with the rapidly fluctuating energy pattern of preselected second input components.

In an embodiment, the third frequency range is selected from a predetermined number of frequency ranges, as the frequency range which is closest to the first frequency range according to a predetermined frequency range distance formula. Since low, mid and high frequency components generally all show different fluctuation patterns, further improved results are achieved when the energy of the output components is set equal to the energy of components in a frequency range close to the frequency range of the generated output components. E.g. if high frequencies are missing in the input audio signal and hence are generated, the highest frequency range from the number of available frequency ranges containing components of the input audio signal will have the energy fluctuation pattern most similar to what is natural for the output components.

In a variant on the method or its previous embodiment, the first output energy measure is set by further using a second input energy measure over a predetermined third time interval of third input components, in a predetermined fourth frequency range of the input audio signal. When measuring multiple energies of respective frequency ranges, it even becomes possible to estimate the change of energy fluctuation pattern for successive frequency ranges along the frequency axis. E.g. suppose that the fluctuation speed increases linearly from one frequency range to the next. Then the previous embodiment only performs a so-called zero order hold estimation of the required energy of the output components, whereas with two or more energy measurements other estimations are possible, such as e.g. a polynomial estimation.

It is advantageous if the predetermined calculation comprises applying a non-linear function to first input components in a predetermined second frequency range of an input audio signal. This is a technically simple way to realize the generation of the output components. Preferably, the input audio signal is divided into adjacent frequency ranges, e.g. by band filtering, and a non-linear function is applied to the band filtered signal in each frequency range. Another option is to use a frequency synthesizer to synthesize output components with a predetermined amplitude.

The second object is realized in that:

filtering means are comprised for obtaining second input components in a third frequency range of the input audio signal;

energy calculation means are comprised for obtaining a first input energy measure over a second predetermined time interval of the second input components and deriving therefrom a first output energy measure; and

energy setting means are comprised for setting the energy of the output components over a first predetermined time interval substantially equal to the first output energy measure.

If in the apparatus the input signal is band filtered by a number of band pass filters, the energies of the band limited signals outputted by the filters can be used for obtaining the output energy measures for a number of frequency ranges containing generated output components.

These and other aspects of the method, the apparatus, the audio player, the computer program and the data carrier according to the invention will be apparent from and elucidated with reference to the implementations and embodiments described hereinafter, and with reference to the accompanying drawings, which serve merely as non-limiting illustrations.

In the drawings:

FIG. 1 schematically shows an audio signal before and after applying the method according to the invention;

FIG. 2 schematically shows a flowchart of the method according to the invention;

FIG. 3 schematically shows a band pass filtered signal in time;

FIG. 4 schematically shows the method according to the invention for reconstructing missing components in a gap between input components;

FIG. 5 schematically shows an apparatus according to the invention;

FIG. 6 schematically shows an audio player.

FIG. 7 schematically shows a data carrier.

In these Figures elements drawn dashed are optional or alternatives.

In FIG. 1, an input audio signal 100 is shown which symbolically contains first input components 102 in a second frequency range R2, second input components 104 in a third frequency range R3, and third input components 103 in a fourth frequency range R4. The frequency ranges R2, R3 and R4 are substantially included in a quality frequency range O. Input audio signal 100 also contains low quality components 110 in a low quality frequency range L, outside quality frequency range O. Such an input audio signal 100 is e.g. the result of decompressing a source of compressed audio, such as MPEG-1 Audio Layer 3 (MP3), Advanced Audio Coding (AAC), Windows Media Audio (WMA) or RealAudio.

Components are labeled as low quality or quality components by different labeling techniques, depending e.g. on the source of the input audio signal 100, or on choices made concerning the realization of a particular embodiment of the method or apparatus according to the invention. In a first class of labeling techniques, certain frequency ranges are labeled a priori as quality frequency range O, or conversely as low quality frequency range L, by a designer of an embodiment. E.g., the source of input audio signal 100 may be such that there is no signal present outside quality frequency range O, or that there is just noise, which is not related to the input components 102, 103, 104 in the quality frequency range O. This occurs e.g. when the input audio signal 100 is decompressed from an MP3 source for which a choice was made not to code frequencies above e.g. 11 kHz. For a low total amount of bits available to code an audio signal, e.g. below 64 kbps, spending bits on components above 11 kHz would imply that there are not enough bits for the components below 11 kHz, which results in annoying audible artifacts. Hence components with frequencies higher than 11 kHz are not coded, and are lost. For this MP3 source, the designer labels the components above 11 kHz as low quality components 110, and the frequency ranges R2, R3 and R4 are substantially below 11 kHz and in the quality frequency range O. A first frequency range R1 can be designed in such a manner that the method generates output components up to e.g. 16 kHz. In other words, the designer implements in this way his desire that components should exist up to 16 kHz, which are artificially generated in a first frequency range R1 from 11 kHz to 16 kHz.

A second class of labeling techniques analyses the input audio signal in real time. This is realized by means of a quality measure, which indicates that the quality of components in a low quality frequency range L is inferior to the quality of components in the quality frequency range O. A possible quality measure is the number of bits spent on the components in the low quality frequency range, as compared to a predetermined threshold of bits known to give good perceptual quality. Such a threshold can be determined e.g. by means of listener panel tests. In particular if the quality of the components in the low quality frequency range L is lower than the quality of artificially generated output components 125 according to the method of the invention, it can be desirable to replace the low quality components 110 by the output components 125, at least in a first frequency range R1.

FIG. 1b shows an output audio signal 120, resulting from applying the method of the invention. Preferably, the output audio signal 120 contains original components 122, which are substantially identical to the components 102, 103, 104 in the quality frequency range O of the input audio signal 100. Alternatively, it might be preferable to replace e.g. some of the second input components 104 in the third frequency range R3 which are adjacent to the first frequency range R1, so that there is a better match between the original components 122 and output components 125, which are generated by performing a predetermined calculation 200 (see FIG. 2), e.g. a synthesis of the output components with a predetermined unity amplitude. The input components 102, 103, 104 may also undergo a number of predetermined transformations, such as filtering, before being copied as original components 122.

The output components 125 can be generated by a number of variants of the calculation 200. E.g., loss of high frequency components in an MP3 coded audio signal is clearly audible, and hence it is preferred that frequencies above e.g. 11 kHz are generated. A first variant, which is the variant of a preferred embodiment of the method (for which a corresponding apparatus is schematically shown in FIG. 5), generates the output components 125 on the basis of first input components 102 in a predetermined second frequency range R2 of the input audio signal 100, e.g. by calculation means 506 being a non-linear function calculation (e.g. on a DSP or as a circuit), which applies a non-linear function to the first input components 102. When the non-linear function is e.g. a squaring, output components O(t) 125 of double frequency compared to the frequency of the first input components I(t) 102 are generated according to Eq. 1, written here for a single sinusoidal input I(t) = sin ωt:

$$O(t) = f[I(t)] = \sin^{2}\omega t = \frac{1}{2}\left(1 - \cos 2\omega t\right) \qquad \text{[Eq. 1]}$$

Hence when output components in the first frequency range R1 are required, a second frequency range R2 can be defined whose bounds are half the frequency of the bounds of R1. Another option is to filter away second harmonics that are outside the predetermined first frequency range R1. Other non-linear functions can generate other higher harmonics, e.g. of triple frequency. An interesting non-linear function to apply to the first input components 102 is an absolute value. Application of a squaring function has the disadvantage that the amplitude of the output components 125 is proportional to the square of the amplitude of the first input components 102, which introduces perceptible artifacts. To correct for the squared amplitude dependency, a square root of the output components 125 should preferably be calculated. The squaring and square root functions can be combined into an absolute value operation.
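The amplitude behaviour described above can be checked numerically. The following is a minimal sketch, not the implementation of the known method or of calculation means 506: the sample rate, the test frequency, the signal length and all function names are illustrative assumptions. It halves the bounds of R1 to obtain R2, applies a squaring and an absolute value function to a pure tone, and measures the component that appears at double the input frequency.

```python
import math

RATE_HZ = 48000    # hypothetical sample rate (assumption)
F_IN_HZ = 5500.0   # a first input component inside R2 = 5 to 6 kHz
N = 4800           # 0.1 s: a whole number of periods at 5.5 kHz and 11 kHz


def source_range_for(r1_low_hz, r1_high_hz):
    """Bounds of R2 such that second harmonics of R2 fall inside R1."""
    return r1_low_hz / 2.0, r1_high_hz / 2.0


def tone_amplitude(samples, freq_hz):
    """Amplitude of the component at freq_hz, estimated with a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * k / RATE_HZ)
             for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * k / RATE_HZ)
             for k, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)


def second_harmonic_amplitude(input_amplitude, nonlinearity):
    """Apply `nonlinearity` to a pure tone and measure the result at 2 * F_IN_HZ."""
    tone = [input_amplitude * math.sin(2 * math.pi * F_IN_HZ * k / RATE_HZ)
            for k in range(N)]
    return tone_amplitude([nonlinearity(x) for x in tone], 2 * F_IN_HZ)


def square(x):
    """Squaring non-linearity, as in Eq. 1."""
    return x * x


if __name__ == "__main__":
    print("R2 for R1 = 10..12 kHz:", source_range_for(10000.0, 12000.0))
    for a in (1.0, 2.0):
        print("input amplitude", a,
              "| squared:", second_harmonic_amplitude(a, square),
              "| absolute value:", second_harmonic_amplitude(a, abs))
```

Doubling the input amplitude quadruples the second harmonic produced by squaring, but only doubles the one produced by the absolute value, which is the linear amplitude dependency the text aims for.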

A second variant of the calculation 200 does not make use of the first input components 102 of the input audio signal 100. When the method is executed e.g. on a digital signal processor (DSP), the output components are synthesized by signal synthesizer 580 in the first frequency range with a predetermined amplitude, as is well known in the art. With this variant the input audio signal 100 is not used to generate the output components 125, but it is used in the setting part 201 (see FIG. 2) of the method.

In the setting part 201 of the method, a first input energy measure E1 is calculated for the second input components 104 over a second predetermined time interval dt2 as shown in FIG. 3. The second input components 104 can be obtained by producing a band limited signal 300, which is a part of the input audio signal 100 restricted to the frequencies of a third frequency range R3, i.e. obtained e.g. after filtering the input audio signal 100 with a band pass filter such as 503. The first input energy measure E1 for a certain time instant t is then e.g. calculated by means of Eq. 2:

$$E1(t) = \int_{t - dt2/2}^{t + dt2/2} P_{BL}(t)\,\mathrm{d}t, \qquad \text{[Eq. 2]}$$

in which $P_{BL}(t)$ is the instantaneous audio power of the band-limited signal 300. Instead of using a multiband decomposition of the input audio signal, a discrete Fourier transform can also be used, in which case the first input energy measure E1 can be calculated e.g. by means of Eq. 3:

$$E1(t) = \int_{t - dt2/2}^{t + dt2/2} \int_{f3l}^{f3u} P_{BL}(t, f)\,\mathrm{d}f\,\mathrm{d}t, \qquad \text{[Eq. 3]}$$

in which f3l and f3u are the lower and upper frequency of the third frequency range R3. The second predetermined time interval dt2 should be chosen small enough so that energy fluctuations of the input audio signal 100 can be accurately tracked. E.g. if the input audio signal 100 contains music of which the energy in the third frequency range R3 changes appreciably every 100th of a second, the second predetermined time interval dt2 should be no larger than a 100th of a second. From the first input energy measure E1 a first output energy measure S1 over a predetermined first time interval dt1 is derived. In a simple embodiment, the first time interval dt1 equals the second time interval dt2, and the first output energy measure S1 equals the first input energy measure E1.
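For sampled audio, the integral of Eq. 2 becomes a sum over the samples in a window around t. The sketch below is one possible discrete counterpart under stated assumptions: the instantaneous power is taken as the squared sample value, the constant sample period factor is omitted, and the function names, the hop size and the handling of the signal edges are illustrative choices rather than anything prescribed by the description.

```python
def window_length(dt2_s, rate_hz):
    """Number of samples spanned by the time interval dt2 (at least one)."""
    return max(1, round(dt2_s * rate_hz))


def input_energy(band_limited, centre, window):
    """Discrete counterpart of Eq. 2: sum of squared samples of the band
    limited signal, taking window // 2 samples on each side of `centre`.
    The window is truncated at the start and end of the signal."""
    half = window // 2
    start = max(0, centre - half)
    stop = min(len(band_limited), centre + half + 1)
    return sum(x * x for x in band_limited[start:stop])


def energy_track(band_limited, window, hop):
    """E1 evaluated every `hop` samples, giving the energy fluctuation pattern."""
    return [input_energy(band_limited, c, window)
            for c in range(0, len(band_limited), hop)]


if __name__ == "__main__":
    # dt2 of a 100th of a second at a 44.1 kHz sample rate.
    print(window_length(0.01, 44100))
    print(energy_track([1.0] * 10, 3, 5))
```

In the simple embodiment, where dt1 equals dt2 and S1 equals E1, the list returned by `energy_track` could be used directly as the sequence of target energies for the output components.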

In an audio signal, components in different frequency ranges show different energy fluctuation patterns. E.g. low frequencies typically fluctuate slowly, whereas high frequencies fluctuate rapidly. Since in the first variant of the calculation 200 the output components 125 are derived from the first input components 102, which in FIG. 1 are low frequencies, the energy fluctuation pattern of the output components 125, without applying the setting part 201 of the method, is substantially the energy fluctuation pattern of the first input components 102, hence typical of low frequencies, rather than a high frequency energy fluctuation pattern as is expected for a naturally sounding output signal 120. Hence to make the output audio signal 120 sound more natural, the first output energy measure S1(t) has to be set to a value which is more typical of high frequencies. A first output energy measure selection variant has a predetermined number of frequency ranges at its disposal, e.g. R2, R3 and R4. The preferred frequency range for determining the first output energy measure S1 is the third frequency range R3, since of the predetermined frequency ranges containing quality audio components it is the one which contains the highest frequencies. Its energy fluctuation pattern will probably be most similar to a natural energy fluctuation pattern for the even higher frequencies in the first frequency range R1 of the output components. If second output components 126 are generated, e.g. by squaring the second input components 104 in the third frequency range R3, R3 is again a good choice for obtaining their second output energy measure S2(t). In this variant, a so-called zero order hold estimation of the output energy measures S1, S2 of the output components 125, 126 is employed, by using the closest frequency range, namely the third frequency range R3.

For determining which frequency range is the closest, a number of frequency range distance formulae can be used. If the frequency ranges are non-overlapping, the upper and lower bounds can be used for calculating the distance D, as e.g. in Eq. 4:

$$D = \begin{cases} f_{l}^{RX} - f_{u}^{R1} & \text{if frequency range RX contains frequencies higher than in R1} \\ f_{l}^{R1} - f_{u}^{RX} & \text{if RX contains frequencies lower than in R1} \end{cases} \qquad \text{[Eq. 4]}$$

in which the indexes l and u indicate the lowest resp. highest frequency in a range. In case overlapping ranges are used, the difference between the median, midpoint or average frequencies of both frequency ranges can be used. The upper and lower bounds can be used for overlapping ranges also. The closest frequency range may alternatively be defined a priori by the designer of the method.

FIG. 4 shows a case of an input audio signal 100 for which output components 125 have to be generated in between two frequency ranges R2 and R2′ containing quality audio. R3 and R3′ are now candidates for being the closest frequency range, which has an energy fluctuation most similar to what is to be expected for the first output energy measure S1(t) of the output components 125 next to them. In case of equal distance, a heuristic can e.g. prefer the one containing the lowest frequencies. The output audio signal 120 can be formed by e.g. copying the components from the input audio signal 100 in the parts of the frequency ranges R2 and R2′ outside the first frequency range R1, and generating output components in the first frequency range R1 on the basis of components from R2 and R2′.
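The selection of the closest frequency range, including the tie-breaking heuristic for the FIG. 4 case, can be sketched as follows. This is an illustration only: representing a range as a (lower, upper) pair in Hz, the function names and the example band edges are assumptions, and only the non-overlapping case of Eq. 4 is covered.

```python
def range_distance(rx, r1):
    """Distance D of Eq. 4 between non-overlapping ranges (lower_hz, upper_hz)."""
    rx_low, rx_high = rx
    r1_low, r1_high = r1
    if rx_low >= r1_high:        # RX contains frequencies higher than in R1
        return rx_low - r1_high
    if rx_high <= r1_low:        # RX contains frequencies lower than in R1
        return r1_low - rx_high
    raise ValueError("ranges overlap; use e.g. a midpoint distance instead")


def closest_range(candidates, r1):
    """Name of the candidate range closest to R1.

    `candidates` maps a range name to its (lower_hz, upper_hz) bounds.
    When two ranges are equally distant, the one containing the lowest
    frequencies is preferred.
    """
    return min(candidates,
               key=lambda name: (range_distance(candidates[name], r1),
                                 candidates[name][0]))


if __name__ == "__main__":
    # Hypothetical band edges below an 11 to 16 kHz first frequency range R1.
    bands = {"R2": (4500.0, 5600.0), "R4": (5600.0, 7100.0), "R3": (7100.0, 8900.0)}
    print(closest_range(bands, (11000.0, 16000.0)))
```

With quality ranges on both sides of R1, as in FIG. 4, the same function picks between R3 and R3′, and falls back on the lower range when the distances are equal.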

Instead of using a zero order hold estimation for the output energy measures S1 resp. S2 of the output components 125 and 126, more advanced estimations of a natural energy fluctuation pattern for the higher frequencies can be employed, if a second input energy measure E2 over a predetermined third time interval dt3 of third input components 103, in a predetermined fourth frequency range R4 of the input audio signal 100, is measured. If there is e.g. a linearly decreasing trend of a time interval dtF of fluctuation in the frequency ranges R2, R4 and R3, this trend can be expected to continue and hence be set for R1 and R5. dtF can be defined e.g. as a time interval in which the input energy measure of a frequency range as calculated by Eq. 2 has changed by 10%. The variation from frequency range to frequency range of other parameters like the standard deviation of the input energy measure can also be tracked and used in setting a naturally sounding energy fluctuation pattern for the higher frequencies, e.g. S1(t) for the output components 125. More complicated non-linear estimations can also be employed.
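A linear trend of the fluctuation interval dtF across frequency ranges could be continued as in the sketch below. The description does not say how the trend is fitted, so several choices here are assumptions: a least squares line is used, each range is represented by a centre frequency, the extrapolated value is clamped to a small positive floor, and all function names and example numbers are hypothetical.

```python
def time_to_change(energies, hop_s, fraction=0.10):
    """dtF: time until the energy measure has changed by `fraction` (10%)
    relative to its first value. Returns None if that never happens."""
    reference = energies[0]
    for k, e in enumerate(energies[1:], start=1):
        if abs(e - reference) >= fraction * abs(reference):
            return k * hop_s
    return None


def fit_line(xs, ys):
    """Least squares slope and intercept of ys as a function of xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate_dtf(centres_hz, dtf_s, target_hz, floor_s=1e-3):
    """Continue the linear dtF trend of the measured ranges to `target_hz`.
    The result is clamped to `floor_s` so that it stays positive."""
    slope, intercept = fit_line(centres_hz, dtf_s)
    return max(floor_s, slope * target_hz + intercept)


if __name__ == "__main__":
    # Hypothetical dtF values measured in three ranges of rising frequency.
    print(extrapolate_dtf([5000.0, 6000.0, 7000.0], [0.030, 0.025, 0.020], 8000.0))
```

The extrapolated dtF only states how fast the energy of the generated components should fluctuate; turning it into an actual S1(t) is a further design choice that this sketch leaves open.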

Without departing from the scope of the invention, the setting part 201 and calculation 200 could be combined in a single part.

FIG. 5 schematically shows an apparatus 500 according to the invention. Before applying a non-linear function to the input audio signal 100 (e.g. an MP3 stream at 64 kbps upsampled to 44.1 kHz) to obtain output components 125, it is advantageous to first split up the input signal into a number of band pass filtered subsignals. Eq. 1 is only valid for a single frequency. If the squaring function is applied to a signal containing multiple frequencies, mixing terms are introduced, which creates distortion. E.g. in the case of music, introducing harmonics of the instruments present is acceptable, but introducing other frequencies makes the music sound out of tune. So it is advantageous to apply multiple non-linear functions 506, 507 and 508 to subsignals in adjacent relatively narrow frequency bands created by means of band pass filters 501, 502 and 503. The pass bands of the filters can be chosen according to the IEC 1260 standard, containing tierces (one-third octave bands), e.g. centered at 5 kHz, 6.3 kHz and 8 kHz. The filters may be fixed or adaptive, in which case a range providing unit 595 (e.g. a memory containing a fixed value, or an algorithm supplying a calculated value) may be present. Further filters 509, 510 and 511 may be present to pass signals in the corresponding double frequency bands at 10 kHz, 12.5 kHz and 16 kHz. If the non-linear functions are absolute value functions, many harmonics are generated, but only the second harmonic may be desirable since the other harmonics only distort the output audio signal 120, in which case the other harmonics are filtered out by filters 509, 510 and 511. The non-linear functions can be embodied in hardware as in the prior art or as an algorithm running on a DSP. Instead of being a battery of non-linear functions, the calculation means can also be realized as a signal synthesizer 580, which is e.g. an algorithm which synthesizes components of equal amplitude for all frequencies in the first frequency range R1.

Filter 590 generates a band limited signal corresponding to the second input components 104, e.g. as a band pass filter, and is connected to a first energy measuring unit 521, part of an energy calculation unit 525. Alternatively, for reasons of economy, the second input components 104 can also be chosen from among the subsignals, e.g. by providing a signal path 504 between the band limited subsignal outputted by the third band pass filter 503 and the first energy measuring unit 521. The first energy measuring unit 521 measures the first input energy measure E1, e.g. according to Eq. 2, realized in hardware or software. From the first input energy measure E1 a first output energy measure S1 can be derived by an output energy specification unit 520, by means of a calculation which, if desired, takes into account further input energy measures such as a second input energy measure E2, derived by a second energy measuring unit 522 on the basis of e.g. the signal outputted by the second band pass filter 502. A second output energy measure S2 can be derived in a similar way.
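The remark that Eq. 1 holds only for a single frequency can be made concrete by squaring the sum of two unit amplitude sinusoids. The angular frequencies $\omega_1$ and $\omega_2$ are illustrative symbols that do not appear in the figures:

$$\left(\sin\omega_1 t + \sin\omega_2 t\right)^{2} = 1 - \frac{1}{2}\cos 2\omega_1 t - \frac{1}{2}\cos 2\omega_2 t + \cos(\omega_1 - \omega_2)t - \cos(\omega_1 + \omega_2)t$$

Besides the two second harmonics, a difference term and a sum term appear. For tones at e.g. 5 kHz and 6.3 kHz these mixing terms lie at 1.3 kHz and 11.3 kHz, neither of which is a harmonic of either tone. Squaring each narrow band subsignal separately avoids mixing between components from different bands.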

The output components 125 and if desired second output components 126 are generated as follows. First, intermediate signals 593 resp. 594 resulting from calculation means 506 resp. 507, and possibly filtered by filters 509 resp. 510, are normalized to unit energy by normalization units 512 resp. 513. Then energy setting units 515 resp. 516 set the energy of the output components 125 and second output components 126 to the desired values S1 resp. S2 at all desired times t. Hence the energy setting units 515 resp. 516 function as amplitude modulators. They can be realized in software as an algorithm scaling each sample with the factor S1 resp. S2, or in hardware as a multiplier or a controlled amplifier. The generated output components 125 and second output components 126 are added by an adder 519 to the quality components of the input signal 100. The input signal can optionally be processed by a conditioning unit 540, which e.g. filters out components in the low quality frequency range L.
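The normalization and energy setting steps can be sketched per block of samples as shown below. This is a minimal illustration under stated assumptions, not the circuit of units 512 to 516: the block-wise processing, the function names and the handling of silent blocks are hypothetical, normalization and scaling are merged into a single gain, and the gain is taken as the square root of the energy ratio so that the block energy (sum of squared samples) equals the target value, whereas the text speaks simply of scaling each sample with the factor S1.

```python
import math


def block_energy(block):
    """Energy of a block, taken as the sum of squared samples."""
    return sum(x * x for x in block)


def set_block_energy(block, target_energy, eps=1e-12):
    """Scale a block so that its energy equals `target_energy`.

    Normalizing to unit energy and then scaling to the target collapses
    into one gain. A (near) silent block is returned as silence, since
    it cannot be normalized.
    """
    energy = block_energy(block)
    if energy <= eps:
        return [0.0] * len(block)
    gain = math.sqrt(target_energy / energy)
    return [gain * x for x in block]


def apply_energy_envelope(intermediate, targets, block_len):
    """Impose one target energy per consecutive block of `block_len` samples."""
    out = []
    for i, target in enumerate(targets):
        block = intermediate[i * block_len:(i + 1) * block_len]
        out.extend(set_block_energy(block, target))
    return out


if __name__ == "__main__":
    generated = [1.0, -1.0, 1.0, -1.0, 2.0, 2.0, 2.0, 2.0]
    shaped = apply_energy_envelope(generated, [8.0, 1.0], 4)
    print(block_energy(shaped[:4]), block_energy(shaped[4:]))
```

Here `targets` plays the role of S1 evaluated once per first time interval dt1, so the generated components inherit the fluctuation pattern measured in the third frequency range rather than their own.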

FIG. 6 shows an example of an audio player 600 in which an apparatus according to the invention is comprised. The audio player 600 in FIG. 6 is a portable MP3 player, but could also be e.g. an Internet radio. Another product comprising the apparatus or applying the method according to the application is an audio player which generates e.g. a Super Audio CD (SACD)-like signal from a CD signal. The audio player 600 comprises an audio data input 601, e.g. a disk reader, or a connection to the Internet, from which compressed music is downloaded into a memory. The audio player 600 also comprises an audio signal output 602 for outputting a final output audio signal 603 after processing, which may connect to headphones 604.

It should be noted that the above-mentioned embodiments illustrate rather than limit the invention and that those skilled in the art are able to design alternatives, without departing from the scope of the claims. Apart from combinations of elements of the invention as combined in the claims, other combinations of the elements within the scope of the invention as perceived by one skilled in the art are covered by the invention. Any combination of elements can be realized in a single dedicated element. Any reference sign between parentheses in the claim is not intended for limiting the claim. The word “comprising” does not exclude the presence of elements or aspects not listed in a claim. The word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.

The invention can be implemented by means of hardware or by means of software running on a computer.

1. A method of generating an output audio signal by adding output components in a predetermined first frequency range to an input signal, the output components being generated by performing a predetermined calculation, characterized in that a first output energy measure, over a predetermined first time interval, of the output components generated is set, based upon a first input energy measure calculated over a predetermined second time interval of second input components, in a predetermined third frequency range of the input audio signal.

2. A method as claimed in claim 1, wherein the third frequency range is selected from a predetermined number of frequency ranges, as the frequency range which is closest to the first frequency range according to a predetermined frequency range distance formula.

3. A method as claimed in claim 1, wherein the first output energy measure is set by further using a second input energy measure over a predetermined third time interval of third input components, in a predetermined fourth frequency range of the input audio signal.

4. A method as claimed in claim 1, wherein the predetermined calculation comprises applying a non-linear function to first input components in a predetermined second frequency range of an input audio signal.

5. An apparatus for generating an output audio signal by adding output components in a predetermined first frequency range to an input audio signal, comprising calculation means for calculating the output components, characterized in that: filtering means are comprised for obtaining second input components in a third frequency range of the input audio signal; energy calculation means are comprised for obtaining a first input energy measure over a second predetermined time interval of the second input components and deriving therefrom a first output energy measure; and energy setting means are comprised for setting the energy of the output components over a first predetermined time interval substantially equal to the first output energy measure.

6. An audio player, comprising audio data input means for providing an input audio signal to an apparatus as claimed in claim 5, the apparatus delivering an output audio signal to signal output means.

7. Computer program for execution by a processor, describing a method as claimed in claim 1.

8. A data carrier storing a computer program for execution by a processor, the computer program describing a method as claimed in claim 1.